Exploiting prosodic features for dialog act tagging in a discriminative modeling framework
Authors
Abstract
Cue-based automatic dialog act tagging uses lexical, syntactic and prosodic knowledge to identify dialog acts. In this paper, we propose a discriminative framework for automatic dialog act tagging using maximum entropy modeling. We propose two schemes for integrating prosody into our modeling framework: (i) syntax-based categorical prosody prediction from an automatic prosody labeler, and (ii) a novel method that models the continuous acoustic-prosodic observation sequence as a discrete sequence by means of quantization. The proposed prosodic feature integration results in a relative improvement of 11.8% over using lexical and syntactic features alone on the Switchboard-DAMSL corpus. Using lexical, syntactic and prosodic features together yields a dialog act tagging accuracy of 84.1%, close to the human agreement of 84%.
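The quantization idea in scheme (ii) can be illustrated with a minimal sketch. The abstract does not say how the quantization is done, so everything here is an assumption: uniform bins over a fixed range, made-up normalized pitch values, and the hypothetical helper names `quantize` and `ngram_features`. The resulting n-gram counts are the kind of discrete features that could be passed to a maximum entropy classifier.

```python
from bisect import bisect_right
from collections import Counter


def quantize(values, lo, hi, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1].

    Assumption: uniform bins over [lo, hi]; values outside the range
    fall into the first or last bin.
    """
    width = (hi - lo) / n_bins
    # Interior bin edges only; bisect then yields the bin index directly.
    edges = [lo + width * i for i in range(1, n_bins)]
    return [bisect_right(edges, v) for v in values]


def ngram_features(symbols, n, prefix):
    """Count n-grams over a discrete symbol sequence as named features."""
    grams = zip(*(symbols[i:] for i in range(n)))
    return Counter(prefix + "_" + "_".join(map(str, g)) for g in grams)


# Made-up normalized pitch (f0) values for one utterance.
f0 = [-1.5, -0.4, 0.3, 1.2, 0.8]
symbols = quantize(f0, lo=-2.0, hi=2.0, n_bins=4)
features = ngram_features(symbols, n=2, prefix="f0")
print(symbols)
print(dict(features))
```

In this sketch the number of bins trades resolution against sparsity: more bins preserve more of the contour's shape but produce rarer n-grams.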
Similar resources
Combining lexical, syntactic and prosodic cues for improved online dialog act tagging
Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use a coarse representation of the prosodic contour through summary statistics. The proposed scheme for expl...
Training a prosody-based dialog act tagger from unlabeled data
Dialog act tagging is an important step toward speech understanding, yet training such taggers usually requires large amounts of data labeled by linguistic experts. Here we investigate the use of unlabeled data for training HMM-based dialog act taggers. Three techniques are shown to be effective for bootstrapping a tagger from very small amounts of labeled data: iterative relabeling and retrain...
Dialog Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialog acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialog act sequence. The dialog model is based on treating the d...
Direct Modeling of Prosody: An Overview of Applications in Automatic Speech Processing
We describe a “direct modeling” approach to using prosody in various speech technology tasks. The approach does not involve any hand-labeling or modeling of prosodic events such as pitch accents or boundary tones. Instead, prosodic features are extracted directly from the speech signal and from the output of an automatic speech recognizer. Machine learning techniques then determine a prosodic m...
A Frame-Synchronous Prosodic Decoder for Text-Independent Dialog Act Recognition
Dialog act (DA) recognition is an important intermediate task in speech understanding systems. Although past research has demonstrated that prosody can improve the performance of recognizers relying primarily on words, how prosody fares on its own is not well understood. The current work continues an ongoing investigation into settings in which both words and word boundaries are unavailable, wh...